(57) Abstract: A human computer interface device is provided in which the operation of the user interface depends upon detected physical attributes of the user. Location attributes are also used to modify the operation of the user interface. Also provided is a mobile conferencing device incorporating such a human computer interface device. In this case the ring-tone or a visual display can be tailored according to the detected location.
PERSONAL MOBILE COMMUNICATION DEVICE



This invention relates to a device in which the user interface of a mobile personal
device is modified according to physical and location context. In particular this invention
relates to a mobile teleconferencing device. In a telecommunications conferencing
(teleconferencing) facility images are generated relating to a "virtual meeting space".
Individuals at a plurality of locations remote from each other, and accessing the facility
using different types of access device, may interact with each other in a manner which
emulates a conventional meeting. When the user is using a teleconferencing facility the
physical and location attributes may be used to modify a representation of the user. The
detected physical and location attributes may also be used to modify the interface of the
teleconferencing device.

Individual users are represented in the virtual meeting space display by
computer-generated representations of the users, known as "avatars" (or "icons"). These
may be derived from video images of the users, either live or retrieved from a store, but
usually they are digitally generated representations. In general, each user is able to
select the appearance of his or her avatar in the virtual space from a menu of
characteristics. Alternatively, each individual user may be able to select, for his own
viewpoint, how each of the other users' avatars will appear. Other characteristics of the
meeting space, such as the colour and shape of the elements of the meeting space, may
also be selectable by the user.

According to the present invention there is provided a human computer interface device
comprising a user interface device comprising a visual display device and an audio output
device; and a physical detector for detecting physical attributes of a user; in which the
visual display device is arranged to inhibit output via the visual display device when the
user is not stationary.

In a preferred embodiment the device further comprises a location detector for detecting
location attributes of the user and in which the operation of the user interface device
is dependent upon the detected location attributes of the user.



WO 01/29642 



PCT/GB00/03970 




Preferably the output of the audio output device is dependent upon the location attributes 
of the user, and preferably the output of the visual display device is dependent upon the 
location attributes of the user. 

According to another aspect of the invention there is provided a human computer interface
device comprising a user interface device comprising a visual display device and an audio
output device; a physical detector for detecting physical attributes of a user; and a location
detector for detecting location attributes of the user and in which the operation of the user
interface device is dependent upon the detected location attributes of the user.


Preferably the output of the audio output device is dependent upon the location attributes 
of the user, and preferably the output of the visual display device is dependent upon the 
location attributes of the user. 

According to the invention there is also provided a mobile conferencing device including
such a human computer interfacing device.

An embodiment of the invention will now be described by way of example only with
reference to the accompanying drawings, in which:


Figure 1 shows a network with human/machine interface units serving teleconference 
users via respective client apparatuses; 

Figure 2 is a representation of a teleconference as displayed on an interface unit of Figure 
1; 

Figure 3a is a block diagram showing a client apparatus of Figure 1 which incorporates a
physical and location sensor;

Figure 3b is a functional block diagram showing the logical operation of the apparatus 
shown in Figure 3a; and 

Figures 4 to 7 are examples of representations of a user as shown on an interface unit of
Figure 1, in which the representation of the user is dependent upon location and physical
data collected using the apparatus shown in Figure 3a.

Figure 1 shows a network serving four users 1, 2, 3, 4 (not shown) allowing them
to interact in a virtual teleconference. Each user has a respective human/machine
interface unit 21, 22, 23, 24, which includes video and/or audio equipment for the user to
see and/or hear what is happening in the virtual meeting space. The interface unit
includes user input devices (e.g. audio input, keyboard or keypad, computer "mouse" etc.)
to enable the user to provide input to the virtual meeting space. Each interface unit 21,
22, 23, 24 is connected to a respective client apparatus 11, 12, 13, 14 which provides an
interface between the user and a main server 10 which controls the operation of the
meeting space. The server 10 has, as a further input, a virtual reality (VR) definition store
30 which maintains permanent data defining the virtual meeting space (also referred to as
the meeting space definition unit in the specification). The control of the meeting space is
carried out by interaction between the client apparatuses 11, 12, 13, 14 and the server 10.

The display control functions may take place in the server 10, or the display control
functions may be distributed in the client apparatus 11, 12, 13, 14, depending on the
functionality available in the client apparatus. Links between the client apparatus 11, 12,
13, 14 and the server 10 may be permanent hard-wired connections, virtual connections
(permanent as perceived by the user, but provided over shared lines by the
telecommunications provider), or dial-up connections (available on demand, and provided
on a pay-per-use basis), and may include radio links, for example to a mobile device. The
server 10 may have, in addition to the server functionality, similar functionality to the client
apparatus 11, 12, 13, 14, but as shown the server 10 is dedicated to the server function
only.


An example of an image representing a meeting space as it appears on a display 
device is shown in Figure 2. In this example, users 2, 3 and 4 are represented by avatars 
42, 43 and 44 respectively. 

Referring again to Figure 1, in response to inputs from one of the users (e.g. user
1) through his respective user interface 21 the client apparatus 11 transmits these inputs
to the main server 10 which, in accordance with the meeting space definition unit 30,
controls the images to be represented on the other users' screens in the human machine
interface units 22, 23, 24 to represent the activities of the user 1, input through interface
device 21. As a very simple example, the actions of the user 1 when first establishing
contact with the meeting space are translated by the client apparatus 11 and converted by
the server 10 into a representation of the user 1 entering the meeting space, which is in
turn passed to the individual clients 12, 13, 14 to be represented as the avatar of the user
1 moving into the field of view of the display devices 22, 23, 24.

The manner of representation of the individual user 1 in the virtual space, for
example the appearance of the avatar in terms of age, sex, hair colour etc., may be
selected either by the user 1 through his respective client device 11, or by each receiving
user 2, 3, 4 in the meeting space, who may each select an avatar according to his own
requirements to represent the user 1. Similarly, some parts of the virtual meeting space
may be defined centrally in the meeting space definition unit 30, whereas other aspects
may be defined by each individual client apparatus 11, 12, 13, 14 independently of the
others. Such definitions may include colour schemes, the relative locations in the virtual
meeting space of the individual users 1, 2, 3, 4, etc.


The client apparatus 11 is a mobile device, and in the embodiment of the
invention described here the mobile device 11 is a wireless palmtop computer. In this
specification the term mobile device is intended to refer to all computing devices which
may be carried around or worn by a user, and may be used whilst the user is moving
around and active in other tasks. Mobile devices are distinguished from portable devices
which are carried to a location and then used whilst the user is stationary.

However, a mobile device may or may not have visual display capabilities. Even
if the device does have such capabilities, the user 1 may be walking or running or
otherwise distracted, and may not be able to attend to a visual display. The representation
of the user 1 is displayed to the other users 2, 3, 4 as shown in Figure 4, so that the other
users are aware that user 1 is on line, but that the user 1 may not have a visual link to the
teleconference.

For users using a mobile device there are other aspects of the service to consider
besides the fact that the client device 11 may not have as sophisticated input and output
capabilities as other client devices 12, 13, 14. Privacy may be an issue. It is possible that
other people might move in and out of the user's proximity during a conversation. In order
to make the other users in a conference aware of potential privacy issues the user's
avatar is changed as shown in Figure 5 to indicate that the user is on line, but that the
user may not be in private. The user 1 can indicate that there is a privacy issue manually,
by transmitting a signal via the client 11 to the server 10 using a predetermined key or
sequence of keys. The device 11 has an audio input, and as an alternative to using a
manually entered key or sequence of keys to indicate the user is not in private, the
received audio signal is analysed, using known speaker recognition algorithms, to
determine whether speech other than that from the user is detected. The device 11 may
also be equipped with a video input, in which case the video signal received via the video
input can be analysed using known image classification algorithms, for example to detect
whether there is skin detected in the captured image, or to detect the number of faces in the
captured image. The results of such image classification may then be used to indicate to
the server 10 that the user is not in private and the user's avatar is modified accordingly.
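
The privacy cues described above (a manual key sequence, a non-user voice in the audio, extra faces in the video) could be combined along the following lines. This is a minimal sketch only: the `PrivacyInputs` structure, its field names and the "more than one face" rule are illustrative assumptions, and the speaker recognition and face detection steps themselves are taken as already performed elsewhere.

```python
from dataclasses import dataclass


@dataclass
class PrivacyInputs:
    """Hypothetical summary of the privacy cues; not defined in the specification."""
    manual_flag: bool             # user pressed the predetermined key or key sequence
    other_speaker_detected: bool  # speaker recognition found speech not from the user
    face_count: int               # number of faces found in the captured image


def user_is_in_private(inputs: PrivacyInputs) -> bool:
    """Return False as soon as any cue suggests another person may be present."""
    if inputs.manual_flag:
        return False
    if inputs.other_speaker_detected:
        return False
    # Assumption: more than one face means someone besides the user is in view.
    return inputs.face_count <= 1
```

A client could evaluate this whenever new audio or video analysis results arrive and signal the server only when the result changes, so that the avatar of Figure 5 is shown or withdrawn accordingly.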

Another issue which is relevant to mobile users using radio links to access the
virtual meeting space is Quality of Service (QoS). The fixed telephony network uses
64 kbit/s per voice channel while the mobile network uses 9.6 kbit/s per voice channel.
The average number of bits per second transmitted from the client device 11 to the server
10 is monitored by the server 10. The avatar of the user 1 is modified to be more or less
opaque as a function of the average number of bits per second received by the server 10
from the client device 11. Hence the opacity of the avatar representing the user 1 is related
to the QoS as perceived by other users 2, 3, 4. In this embodiment of the invention the
more opaque the avatar the better the perceived QoS.
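
The specification does not say which function relates bit rate to opacity. As one possible illustration, the sketch below assumes a linear mapping between the two channel rates quoted above, clamped at both ends; the function name and the `min_opacity` floor are hypothetical.

```python
FIXED_RATE_BPS = 64_000   # fixed-network voice channel rate quoted in the text
MOBILE_RATE_BPS = 9_600   # mobile-network voice channel rate quoted in the text


def avatar_opacity(avg_bps: float, min_opacity: float = 0.2) -> float:
    """Map an average received bit rate to an opacity in [min_opacity, 1.0].

    The linear shape and the min_opacity floor are illustrative assumptions.
    """
    span = FIXED_RATE_BPS - MOBILE_RATE_BPS
    fraction = (avg_bps - MOBILE_RATE_BPS) / span
    fraction = max(0.0, min(1.0, fraction))  # clamp outside the two reference rates
    return min_opacity + (1.0 - min_opacity) * fraction
```

Keeping a non-zero floor means a user on a poor link remains faintly visible rather than disappearing from the meeting space altogether.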

For a mobile user, the attention paid to the virtual meeting space varies in
dependence upon the 'real world' task currently being carried out. For example, whilst
travelling on a train a user may be required to show a ticket to the ticket inspector, or
somebody may speak to the user to ask the time. If the user is walking, running, or unable
to remain still for some reason, then the attention paid to the virtual meeting space will be
more limited than otherwise. If the user is in a noisy environment, again, the attention paid
to the virtual meeting space will be less than it would be in a very quiet environment.

Detection of a user's physical and location attributes is discussed in more detail with
reference to Figures 3a and 3b.

The audio environment is analysed using the audio signal received via the audio
input on the client apparatus 11. It is also possible for the user to use a predetermined key
or sequence of keys to indicate via the client apparatus 11 to the server 10 that he is
distracted or on the move. Figure 6 shows a representation of a user who is on line but
distracted, and Figure 7 shows a representation of a user who is on line but on the move.

The user interface unit 21 includes a physical and location sensor 50 as shown in Figure
3, as well as a visual display 60 and an audio input/output device 61. The physical and
location sensor 50 is connected to the client apparatus 11 by a serial interface 51. A low
acceleration detector 52 measures acceleration of a low force in two directions using an
ADXL202. A high acceleration detector 53 measures acceleration of a high force in three
directions using an ACH04-08-05 available from Measurement Specialities Incorporated
(which can be referenced via Universal Resource Locator (URL) http://www.msiusa.com
on the Internet). A direction detector 54 is provided using a compass which gives an
absolute measurement of orientation of the client apparatus. A HMC2003, available from
Honeywell (URL http://www.ssec.honeywell.com), is used. The compass is a three-axis
magnetometer sensitive to fields along the length, width and height of the device. A
direction and velocity detector 55 is provided using an ENC Piezoelectric Vibrating
Gyroscope (part number S42E-2 which is sold under the registered trademark
GYROSTAR) available from Murata Manufacturing Company Ltd. (URL
http://www.murata.com). The gyroscope measures angular velocity, giving speed and
direction in two directions in each axis of rotation (i.e. six measurements are provided).
The acceleration detectors 52, 53, the direction detector 54 and the velocity and direction
detector 55 are connected via a multiplexer (MUX) 56 to a microcontroller 57 where the
outputs are analysed as will be described later.

A global position detector 58 is provided which measures the absolute location of
the device using a Global Positioning System (GPS) receiver which receives signals from
GPS satellites.

GPS provides specially coded satellite signals that can be processed in a GPS
receiver, enabling the receiver to compute position, velocity and time. The nominal GPS
Operational Constellation consists of 24 satellites that orbit the earth twice a day, 11,000
miles above the earth. (There are often more than 24 operational satellites as new ones
are launched to replace older satellites.) The satellite orbits repeat almost the same
ground track (as the earth turns beneath them) once each day. There are six orbital
planes (with nominally four satellites in each), equally spaced (60 degrees apart), and
inclined at about fifty-five degrees with respect to the equatorial plane. This constellation
provides the user with from five to eight satellites visible from any point on the earth. The
GPS satellites orbit the earth transmitting their precise position and elevation. A GPS
receiver acquires the signal, then measures the interval between transmission and receipt
of the signal to determine the distance between the receiver and the satellite. Once the
receiver has calculated this data for at least 3 satellites, its location on the earth's surface
can be determined.
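
The ranging step described above amounts to multiplying the signal's travel time by the speed of light. The following sketch shows that arithmetic only; the function name is hypothetical, and it assumes the receiver and satellite clocks are perfectly synchronised, which a practical receiver cannot rely on.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # defined SI value


def satellite_range_m(transmit_time_s: float, receive_time_s: float) -> float:
    """Distance implied by the signal's travel time, ignoring any clock error."""
    return SPEED_OF_LIGHT_M_PER_S * (receive_time_s - transmit_time_s)


# A travel time of 0.07 s corresponds to a range of roughly 21,000 km.
example_range_m = satellite_range_m(0.0, 0.07)
```

Each such range places the receiver on a sphere centred on the satellite's transmitted position, and the receiver's location is found where the spheres from several satellites intersect.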

The receiver used in this embodiment of the invention is a Garmin GPS35 unit
(available, for example, from Lowe Electronics Ltd in the UK). GPS signals do not
propagate inside buildings so a local position detector 59 is also provided which uses local
area beacons (LABs) (not shown) which use low power 418 MHz AM radio transmitters
(such as the CR91Y, CR72P, CR73Q or CR74R from RF Solutions) at known locations
within a building. Radio or infrared transmitters could be used, although radio provides a
more robust solution since line of sight connections are not required.

Once the "Bluetooth" radio based system becomes available this will also provide
a suitable solution. Bluetooth is a standard for wireless connectivity, designed to replace
cables between portable consumer devices such as cellular phones, laptop computers,
personal digital assistants, digital cameras, and many other products. The Bluetooth
version 1.0 specification was agreed in July 1999, and the first products are expected on
the market in mid 2000.

Software on the microcontroller 57 gathers sensor data from the detectors 52, 53,
54, 55 via the MUX 56, which is configured to read each device in turn via an analogue
port. The output from the global position detector 58 is read via a serial port connection
and the output from the local position detector 59 is connected to a digital input on the
microcontroller 57. Also provided is a location database 64 which is accessed by the
microcontroller 57 to determine location names.


Figure 3b is a functional block diagram showing the logical operation of the physical and
location detector 50. A location agent 62, implemented in software on the microcontroller
57, uses location data gathered by the global position detector 58 and the local position
detector 59, analyses this data and makes the analysis available to the client apparatus
11. The location agent 62 also receives information about velocity and direction,
measured by the direction detector 54 and the velocity and direction detector 55, from a
physical agent 63. The physical agent is also implemented in software in the
microcontroller 57.

The location agent determines whether GPS is available, and whether the global
location measured by the global position detector 58 is based on a signal from three or
more satellites. The local position detector 59 detects signals from LABs, each of which
has a unique identifier. The location agent 62 accesses the location database 64 to
determine a location name associated with a received LAB identifier. The location agent
62 must be able to determine the following:

• Is the device inside or outside? If fewer than three GPS signals are received then the
device is determined to be inside.

• Is the device moving? A measured velocity from the global position detector 58 (if the
device is outside) and velocity measured via the physical agent 63 are used to
determine whether the device is moving.

• Location of the device. Latitude and longitude, if the device is outside, are measured
via the global position detector 58 and/or a location name is determined using the local
position detector 59 and the location database 64.

• Direction of movement. This may be determined by the global position detector and/or
by direction data received from the physical agent.
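
The first three determinations in the list above can be sketched as a single function. This is an illustration under stated assumptions rather than the embodiment's actual software: the names `LocationReport` and `assess_location`, the dictionary standing in for the location database 64, and the `MOVING_THRESHOLD_M_PER_S` value are all hypothetical. Only the three-satellite rule comes from the text.

```python
from dataclasses import dataclass
from typing import Dict, Optional

MIN_GPS_SATELLITES = 3           # threshold stated in the text
MOVING_THRESHOLD_M_PER_S = 0.2   # assumed value; the text gives no figure


@dataclass
class LocationReport:
    inside: bool
    moving: bool
    location_name: Optional[str]


def assess_location(
    gps_satellites: int,
    gps_speed_m_per_s: Optional[float],
    physical_speed_m_per_s: float,
    lab_identifier: Optional[str],
    location_database: Dict[str, str],
) -> LocationReport:
    """Combine GPS, beacon and physical-agent data into one report."""
    # Fewer than three GPS signals: the device is taken to be inside.
    inside = gps_satellites < MIN_GPS_SATELLITES

    # GPS velocity is only trusted outside; the physical agent's is always used.
    speeds = [physical_speed_m_per_s]
    if not inside and gps_speed_m_per_s is not None:
        speeds.append(gps_speed_m_per_s)
    moving = max(speeds) > MOVING_THRESHOLD_M_PER_S

    # Translate a received beacon identifier into a location name, if known.
    name = location_database.get(lab_identifier) if lab_identifier else None
    return LocationReport(inside, moving, name)
```

For example, `assess_location(2, None, 0.0, "LAB-7", {"LAB-7": "Meeting room 2"})` reports a stationary device inside "Meeting room 2"; the beacon identifier and room name here are invented.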

The physical agent 63 analyses physical sensor data and makes this available to
the location agent 62. The physical agent is used to determine the following user
attributes.

• Standing.

• Walking.

• Sitting.

• Cadence (velocity).

• Acceleration.

• Shock.

The complex nature of the physical data makes the use of simple rules unreliable. The
physical agent 63 of this embodiment of the invention uses Hidden Markov Models (HMMs)
to provide the determinations above based on the inputs from the detectors 52, 53, 54, 55,
56. A good description of an implementation of HMMs (as applied to speech recognition,
but the principles are the same) may be found in "Hidden Markov Models for Automatic
Speech Recognition: Theory and Application", S.J. Cox, British Telecom Technology
Journal Vol. 6, No. 2, April 1988. In other embodiments of the invention it is possible for
the physical agent to analyse visual and audio information received from the visual and
audio input/output device provided as part of the interface unit 21.
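
The text does not describe the HMM structure, the features or the decoding method used by the physical agent. Purely as an illustration of how an HMM can turn a stream of sensor readings into an activity label, the sketch below runs standard Viterbi decoding over a toy three-state model. Every probability, the state set and the three-symbol quantisation of acceleration ("low", "medium", "high") are invented for the example.

```python
from typing import Dict, List

STATES = ["sitting", "standing", "walking"]

# Toy parameters, invented for illustration only.
START: Dict[str, float] = {"sitting": 0.4, "standing": 0.3, "walking": 0.3}
TRANS: Dict[str, Dict[str, float]] = {
    "sitting":  {"sitting": 0.80, "standing": 0.15, "walking": 0.05},
    "standing": {"sitting": 0.10, "standing": 0.70, "walking": 0.20},
    "walking":  {"sitting": 0.05, "standing": 0.15, "walking": 0.80},
}
# Observation symbols: quantised acceleration energy.
EMIT: Dict[str, Dict[str, float]] = {
    "sitting":  {"low": 0.80, "medium": 0.15, "high": 0.05},
    "standing": {"low": 0.50, "medium": 0.40, "high": 0.10},
    "walking":  {"low": 0.05, "medium": 0.25, "high": 0.70},
}


def viterbi(observations: List[str]) -> List[str]:
    """Return the most likely state sequence for a non-empty symbol sequence."""
    # best[s] is the probability of the best path so far that ends in state s.
    best = {s: START[s] * EMIT[s][observations[0]] for s in STATES}
    back: List[Dict[str, str]] = []
    for obs in observations[1:]:
        step: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for s in STATES:
            prev = max(STATES, key=lambda p: best[p] * TRANS[p][s])
            step[s] = best[prev] * TRANS[prev][s] * EMIT[s][obs]
            pointers[s] = prev
        best = step
        back.append(pointers)
    # Trace back from the most probable final state.
    state = max(STATES, key=lambda s: best[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return list(reversed(path))
```

With these toy numbers a run of "high" readings decodes as walking and a run of "low" readings as sitting. A practical model would be trained on recorded sensor data and would normally work with log probabilities to avoid underflow on long sequences.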


The client apparatus 11 has the physical information made available to it via the
physical agent 63, and the location information made available to it via the location agent
62. Audio and/or visual information is used on the mobile device to provide the user with
information alerts, and for teleconferencing activity. Spatial audio is also used for
information alerts and for spatialised teleconferencing, which appears more natural to the
user.

The interface used by the device for information alerts, and the interface used for
teleconferencing are dependent on the user's current location and physical context (i.e. is
the user standing/walking/sitting etc.). If the user is unlikely to be able to attend to a visual
display, an audio interface is used. If the user is likely to be unavailable (e.g. running) then
the device could divert alerts to a messaging service, which could then alert the user
when it is determined he is available again. In embodiments of the invention incorporating
audio input and analysis it is also possible to configure the audio output on the user's
wearable or handheld device to match the acoustics, ambient noise level etc. of the real
world space in which the user is located. The nature of the interface used (for example the
sound of a mobile device's alert or 'ring-tone') can be modified according to the detected
user location. For example, a mobile phone handset could use a ring-tone such as a voice
saying "shop at the Harrods' sale" if it is determined by the location agent 62 that the user
is walking along Knightsbridge (where the famous shop 'Harrods' is located). A phone
could use an appropriate piece of music if it is determined by the location agent 62 that
the user is in church. Similarly to changing the user's audio interface in dependence on
the detected location, the visual display can be altered according to the determined
location. The screen style of the visual interface can be made to reflect the theme of the
location. For example if the user is viewing web pages, and is walking around a museum,
the web pages viewed as the user moves to different locations change to reflect the area
of the museum.
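
The context-dependent choice of interface described above could be expressed as a small decision function. The sketch is an assumption-laden illustration: the `Activity` and `AlertMode` enumerations, the ring-tone identifiers and the `RING_TONES` table are hypothetical, and treating walking as "unable to attend to a visual display" is one reading of the text rather than a stated rule.

```python
from enum import Enum
from typing import Optional


class Activity(Enum):
    SITTING = "sitting"
    STANDING = "standing"
    WALKING = "walking"
    RUNNING = "running"


class AlertMode(Enum):
    VISUAL_AND_AUDIO = "visual+audio"
    AUDIO_ONLY = "audio"
    DIVERT_TO_MESSAGING = "divert"


# Hypothetical mapping from a location name to a ring-tone identifier.
RING_TONES = {
    "Knightsbridge": "harrods_sale_voice",
    "church": "quiet_music",
}


def choose_alert_mode(activity: Activity) -> AlertMode:
    """Pick an alert interface from the user's physical context."""
    if activity is Activity.RUNNING:
        return AlertMode.DIVERT_TO_MESSAGING  # user is likely unavailable
    if activity is Activity.WALKING:
        return AlertMode.AUDIO_ONLY           # visual display is inhibited
    return AlertMode.VISUAL_AND_AUDIO         # user is stationary


def choose_ring_tone(location_name: Optional[str], default: str = "standard") -> str:
    """Pick a ring-tone from the detected location, falling back to a default."""
    if location_name is None:
        return default
    return RING_TONES.get(location_name, default)
```

Keeping the physical decision (which interface) separate from the location decision (which ring-tone) mirrors the split between the physical agent 63 and the location agent 62.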

In embodiments of the invention including the analysis of visual and audio
information received from a visual and audio input/output device provided as part of the
interface unit 21, it is possible to use standard speech and video analysis algorithms to
provide a more sophisticated interface to the user. There are standard algorithms for
identifying speech within an audio stream so it would be possible to make a mobile phone
handset that auto diverted or changed ring tone if the user is detected to be currently in
conversation with someone. Visual information can also be analysed using standard
algorithms such as skin detection or face detection and this information can be used along
with audio analysis to infer whether the user is likely to be in private, for example.

CLAIMS



1. A human computer interface device comprising

a user interface device comprising a visual display device and an audio
output device; and

a physical detector for detecting physical attributes of a user;

in which the visual display device is arranged to inhibit output via the visual
display device when the user is not stationary.

2. A device according to claim 1, further comprising a location detector for detecting
location attributes of the user and in which the operation of the user interface device
is dependent upon the detected location attributes of the user.

3. A device according to claim 2 in which the output of the audio output device is
dependent upon the location attributes of the user.

4. A device according to claim 2 or claim 3 in which the output of the visual display
device is dependent upon the location attributes of the user.

5. A human computer interface device comprising

a user interface device comprising a visual display device and an audio
output device;

a physical detector for detecting physical attributes of a user; and

a location detector for detecting location attributes of the user and in which
the operation of the user interface device is dependent upon the detected location attributes
of the user.

6. A device according to claim 5 in which the output of the audio output device is 
dependent upon the location attributes of the user. 


7. A device according to claim 5 or claim 6 in which the output of the visual display 
device is dependent upon the location attributes of the user. 






8. A mobile conferencing device including a human computer interface device
according to any one of the preceding claims.